Abstract


The cost and safety goals for NASA’s next generation of reusable launch vehicle (RLV) 
will require that rapid high-fidelity aerothermodynamic design tools be used early in the 
design cycle. To meet these requirements, it is desirable to establish statistical models 
that quantify and improve the accuracy, extend the applicability, and enable combined 
analyses using existing prediction tools. This research focused on establishing 
suitable mathematical/statistical models for these purposes. It is anticipated that the 
resulting models can be incorporated into a software tool that provides rapid, 
variable-fidelity aerothermal environments to predict heating along an arbitrary trajectory. 
This work will support development of an integrated design tool to perform automated thermal 
protection system (TPS) sizing and material selection. 

Introduction 

Recent design experience with NASA’s X-37 has demonstrated the need for considering 
higher fidelity aerodynamic heating early in the design cycle. In the case of X-37, the 
vehicle shape was optimized for aerodynamic performance and resulted in severe 
aerodynamic heating that forced costly redesign of the nose and wing surfaces and 
lowered flight margins. The availability of higher fidelity aerothermal analysis earlier in 
the design cycle could have prevented these problems. 

Development of this technology impacts NASA’s goal of reduced cost by enabling faster 
and more optimized design cycles. Utilizing higher fidelity analyses earlier in the design 
will avoid the delay and expense of late design changes. Optimizing the thermal- 
structural and TPS design in the process will minimize vehicle weight leading to lower 
launch cost. The technology described is applicable across all next generation 
systems/architectures for any vehicle configuration and includes metallic and ceramic 
TPS and hot structures. 

This effort has two phases: model development and model validation. The initial phase 
explores and/or develops the statistical/mathematical methods that can be used to 
transform the pointwise aeroheating predictions of current tools into complete 
aerothermal environments through a trajectory corridor. The approach is intended to 
identify statistical/mathematical models that best characterize and/or model a set of 
sphere stagnation data. Once several acceptable models have been identified, they will 
be tested in the second phase to see whether they can predict heating values within the 
normal range of trajectory corridor values. 




Description of Sphere Stagnation Data Sets 

There were two worksheets for analysis; one containing the full trajectory space (1269 
measurements) and the second with measurements only within the entry corridor (138 
measurements). For these data sets, a sphere shape configuration is used and the 
measurements are taken at the stagnation point on the sphere. For simplicity, a sphere of 
one foot radius is assumed. Generally, as the sphere radius decreases, the heat rate 
measurements will increase. 

For each case, eleven (11) variables are labeled on the worksheets. The first three 
variables (altitude, velocity, and wall temperature) are considered the basic independent 
variables. The other variables are derived from these basic three variables. For example, 
the density, pressure, and temperature are direct functions of altitude. The Mach number, 
dynamic pressure, Reynolds number, and energy variables are combinations of the three 
basic variables; e.g., dynamic pressure is proportional to density*velocity*velocity 
(q = 0.5*density*velocity^2). The response 
variable of interest is heat rate. 
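
As an illustration of how the derived variables follow from the basic ones, the short Python sketch below computes dynamic pressure and Mach number. It is a minimal example and not part of the original analysis: the function names and the sample density, velocity and speed-of-sound values are hypothetical, and dynamic pressure is taken with the standard factor of one half.

```python
def dynamic_pressure(density, velocity):
    """Standard definition q = 0.5 * rho * V**2.

    With density in slugs/ft3 and velocity in ft/sec the result is in lb/ft2.
    """
    return 0.5 * density * velocity ** 2


def mach_number(velocity, speed_of_sound):
    """Mach number is velocity divided by the local speed of sound."""
    return velocity / speed_of_sound


# Hypothetical sample point (not taken from the data sets).
q = dynamic_pressure(1.0e-5, 20000.0)
mach = mach_number(20000.0, 1000.0)
```

In the data sets the density and the speed of sound would themselves come from an atmosphere model evaluated at the given altitude.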

The heating rate values in the spreadsheet were generated using MINIVER-tape 
calculations. The datasets contain values for velocity and altitude but no angle-of-attack 
values because of the sphere assumption. 

Analysis Methods 

Our goal is to begin with a rudimentary analysis of these data sets to explore their 
behavior and relationships using correlations, summary statistics, graphics (including 
contour plots and 3-D plots), and robust and residual analyses. Several regression models were 
investigated including multiple linear regression, polynomial regression, step-wise, best 
subset, quadratic and cubic regression models as well as some nonlinear regression 
models. Regression analysis allows one to model the relationship between a response 
variable and one or more predictor variables. One of the useful features of a regression 
model is that it can be used to predict or estimate a future response value based on a 
given set of values of the predictor variables. 
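
The fit-then-predict use of a regression model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the statistics package used in this study: the helper names (`solve_linear_system`, `fit_least_squares`, `predict`) are hypothetical, the data are synthetic, and the coefficients are obtained from the normal equations using only the standard library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Return [intercept, b1, ..., bk] minimizing the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef, row):
    """Evaluate the fitted equation at one set of predictor values."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Synthetic data generated from y = 2 + 3*x1 - 1*x2 (no noise).
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [2.0 + 3.0 * x1 - 1.0 * x2 for x1, x2 in rows]
coef = fit_least_squares(rows, y)
y_new = predict(coef, (3.0, 2.0))
```

Because the synthetic response is noise-free, the fit recovers the generating coefficients; with real data the residuals would be examined as described below.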

Regression analysis results usually include the following: regression equation, predictor 
table, summary statistics, ANOVA Table, list of unusual observations, contour plots, 3-D 
plots and residual plots. To appropriately use the t-test, F-test and associated confidence 
intervals, the data are assumed to meet certain conditions: (1) the residuals 
(error component) are normally distributed, (2) their variance is constant 
(homoscedasticity) and (3) they are independent. The study of unusual patterns in the residuals 
through residual analysis may indicate underlying weaknesses in the model. 
Data Exploration 


Table 1 in the Appendix lists the 138 data cases used to develop a predictive 
model relating the response variable, heat rate, to the 10 predictor variables shown. Due 
to the large quantity of data, the full trajectory data set is not included in the 
appendix. Unless otherwise indicated, tables and figures appear 
in the Appendix. The summary statistics for the independent and response variables are 
provided in Tables 2 and 2a. 

Prior to the model building activities, graphical methods were used to help 
identify any underlying relationships between the variables being studied. Matrix plots 
of cross-graphs of the variables are provided in Tables 3 and 3a. The plots highlight: 

• Quadratic relationship between heat rate and altitude, temperature 

• Strong positive linear relationship between Mach and velocity, Mach and altitude, 
velocity and altitude 

• Exponential relationship between heat rate and Reynolds number. 

Tables 4 and 4a contain the correlation matrices, which give the 
Pearson correlations and corresponding p-values. Given the inherent relationships 
between the independent or predictor variables, a principal components analysis was 
performed to help define a set of orthogonal variables so that the first principal 
component accounts for the largest possible amount of the total dispersion in the data, the 
second principal component accounts for the second largest possible amount of the total 
dispersion in the data, etc. This would be beneficial in helping to identify a subset list of 
candidate predictor variables for the analysis. Tables 5a and 5b show the results of the 
analysis using the correlation matrix of the predictor variables as input to the principal 
components analysis. 
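
To make the idea of a first principal component concrete, the following Python sketch extracts the leading eigenvalue and eigenvector of a small correlation matrix. It is an illustration only: the matrix is hypothetical, the function name `leading_component` is ours, and power iteration is substituted for whatever algorithm the statistics package uses because it needs only the standard library.

```python
import math


def leading_component(corr, iterations=500):
    """Power iteration on a symmetric matrix.

    Returns (largest eigenvalue, unit eigenvector). The start vector is an
    arbitrary non-uniform choice; it must not be orthogonal to the answer.
    """
    n = len(corr)
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(av, v))  # Rayleigh quotient
    return eigenvalue, v


# Hypothetical correlation matrix: variables 1 and 2 are strongly correlated,
# variable 3 is uncorrelated with both.
corr = [[1.0, 0.9, 0.0],
        [0.9, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
eigenvalue, loadings = leading_component(corr)
# The trace of a correlation matrix equals the number of variables, so this
# ratio is the share of total dispersion carried by the first component.
share = eigenvalue / len(corr)
```

In this example the two correlated variables load equally on the first component, which is the kind of redundancy among predictors that the analysis is meant to expose.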

Another method that was used to identify a subset list of candidate predictor variables is 
best subset selection. Table 6 shows the results of this analysis for the corridor data set. 
Best subset selection identified altitude, velocity, Mach number, dynamic pressure and Reynolds number 
as the top five prediction variables. 
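
A minimal sketch of best subset selection is given below in Python. It assumes an exhaustive search over all predictor subsets of a fixed size ranked by R²; the statistics package may rank by other criteria (adjusted R² and Mallows' Cp are also common). The helper names and the synthetic data are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """Fit y on an intercept plus the given predictor columns; return R^2."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, d)) for d in design]
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def best_subset(predictors, y, size):
    """Return (indices, R^2) of the best-fitting subset of the given size."""
    scored = [(r_squared([predictors[i] for i in idx], y), idx)
              for idx in combinations(range(len(predictors)), size)]
    best_r2, best_idx = max(scored)
    return best_idx, best_r2


# Synthetic predictors; the response depends only on predictors 0 and 2.
predictors = [
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0],
]
y = [1.0 + 2.0 * a + 5.0 * c for a, c in zip(predictors[0], predictors[2])]
best_vars, best_r2 = best_subset(predictors, y, 2)
```

The exhaustive search is practical here because the number of candidate predictors is small; with ten predictors there are only 1023 non-empty subsets.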

Classification and Regression Tree (CART) based models are exploratory techniques 
for uncovering structure in data that are used for: 

• developing prediction rules that can be rapidly evaluated 

• screening variables 

• assessing the adequacy of linear models 

• summarizing large data sets for both classification and regression problems. 

Tables 7a and 7b show the results of the CART analysis and Figure 1 shows the resulting 
CART tree. CART selected velocity, Mach number, altitude, energy and dynamic pressure as the 
primary prediction variables. The tree indicates that for those data cases with velocity 
measurements less than 13750 ft/sec, the predicted values for heat rate are generally below 
30 BTU/ft2/s. Furthermore, if values of Mach number are below 7.7, then the predicted heat rate is 
approximately 3.909. 
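
The kind of rule a regression tree produces can be illustrated with a single split. The Python sketch below searches one predictor for the threshold that minimizes the within-node sum of squared errors, which is the usual regression-tree criterion; it is not the CART software used here, and the velocity and heat rate values are hypothetical rather than taken from the corridor data set.

```python
def sse(values):
    """Sum of squared deviations from the mean (zero for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, left_mean, right_mean) minimizing total within-node SSE."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates identical x values
        threshold = 0.5 * (pairs[i - 1][0] + pairs[i][0])
        left = [yy for _, yy in pairs[:i]]
        right = [yy for _, yy in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, sum(left) / len(left), sum(right) / len(right))
    return best[1], best[2], best[3]


# Hypothetical values chosen so that the heat rate jumps between two regimes.
velocity = [9000.0, 11000.0, 13000.0, 14500.0, 16000.0, 18000.0]
heat_rate = [4.0, 6.0, 8.0, 40.0, 44.0, 48.0]
threshold, low_mean, high_mean = best_split(velocity, heat_rate)
```

A full tree applies the same search recursively to each resulting node and across all predictors, and the node means become the predicted values at the leaves.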


The classical regression methods are often used to obtain models for prediction. The 
challenge is the development of the best mathematical expression to describe in some 
sense the behavior of a random variable of interest as a function of one or more 
independent or predictor variables. The classical regression techniques, however, make 
several strong assumptions about the underlying data, and the data can fail to satisfy these 
assumptions in several different ways, as indicated in the Analysis Methods section. 

In the case where there are one or more outliers in the data or the data may not be fitted 
well by any straight line, robust regression methods come into play. These methods 
minimize the effect of the outliers and can be useful in helping to identify the outliers in 
the data. 
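
As a small illustration of how a robust fit limits the influence of an outlier, the Python sketch below uses the Theil-Sen line (median of pairwise slopes). This is one simple robust estimator chosen for brevity; the report does not state which robust method was applied, and the data are synthetic.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Robust straight-line fit: median pairwise slope, then median intercept."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2) if x[j] != x[i]]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Synthetic data on the line y = 1 + 2x, with the last point replaced by an outlier.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0, 60.0]
slope, intercept = theil_sen(x, y)

# The observation with the largest absolute residual is the outlier candidate.
outlier_index = max(range(len(x)),
                    key=lambda i: abs(y[i] - (intercept + slope * x[i])))
```

An ordinary least squares line through the same points would be pulled toward the outlier, whereas the median-based fit stays on the trend of the other five points and leaves the outlier with a large, easily identified residual.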

Scatterplot smoothers are useful tools for fitting arbitrary smooth functions to a scatter 
plot of data points. The smoother summarizes the trend of the response as a function of the 
predictor variables. All of the above analysis methods are used to explore the given data 
sets. 
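
A scatterplot smoother can be sketched as a locally weighted average. The Python example below uses a Gaussian kernel (Nadaraya-Watson) smoother; the choice of smoother, the bandwidth and the data are assumptions made for illustration and are not taken from the analysis.

```python
import math


def kernel_smooth(x, y, x0, bandwidth):
    """Nadaraya-Watson estimate at x0: a Gaussian-weighted average of y."""
    weights = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Synthetic scatter around the trend y = 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.3, 1.7, 4.0, 6.3, 7.7]
curve = [kernel_smooth(x, y, x0, 1.0) for x0 in x]
```

A smaller bandwidth follows the points more closely, while a larger one gives a flatter curve; near the ends of the data the weighted average is pulled toward the interior.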

Results 

Several approaches were used in the exploration and identification of 
statistical/mathematical models for the given data sets including the classical multiple 
regression, classification and regression trees (CART), and other advanced analysis 
methods. One of the interesting results concerns a comparison of some initial multiple 
regression model types using only the independent predictor variables for both full and 
corridor data sets. These results are summarized below in Table A, which lists the data set, 
coefficient of determination R², adjusted R², fit standard error S, F-statistic and the 
model type. 
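
The summary quantities in these comparisons are related to one another. As a hedged check, the Python sketch below computes adjusted R² and the overall F-statistic from R², the number of cases n and the number of predictors p, using the standard formulas for a model with an intercept. The inputs are the values reported for the linear corridor model in Table A (R² = 85.2%, 138 cases, three predictors); the function names are ours.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n cases and p predictors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall regression F-statistic expressed through R^2."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


adj = adjusted_r2(0.852, 138, 3)   # about 0.849
f = f_statistic(0.852, 138, 3)     # about 257
```

Both results are close to the tabulated 84.9% and 257.21; the small differences arise because R² is entered here rounded to three digits.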

Table A: Comparison of Model Types for Full and Corridor Data using Three (3) 
Independent Predictor Variables 


Data       R²       Adj R²   Std Error   F-statistic   Model Type
Full       48.3%    48.2%    196.5       394.15        Linear
Corridor   85.2%    84.9%    8.72        257.21        Linear
Full       84.4%    84.4%    108.0       1140.45       Quadratic
Corridor   97.6%    97.5%    3.54        895.68        Quadratic
Full       85.6%    85.5%    103.9       937.21        Cubic, Interact
Corridor   99.7%    99.6%    1.34        4751.89       Cubic, Interact


In reviewing Table A, an obvious conclusion is that the regression models appear to be 
more appropriate for the corridor data than the full trajectory data for all model types. 
For every model type comparison in Table A, the corridor data set has higher R² and adjusted R² values 
and lower standard error values. A more in-depth analysis was 
conducted on the corridor data set as it more realistically simulates possible entry 
trajectory and heating rates of an experimental space vehicle. The structured approach 
that was used for model identification consists of the following: (1) analyze the summary 
statistics results for errors and consistency, (2) conduct preliminary analysis using only 
the independent predictor variables, (3) use graphical methods to identify underlying 
relationships, (4) employ methods (Best Subset Selection, Principal Components, etc.) 
that aid in identifying the most likely additional predictor variables and (5) specify a model 
using classical regression, classification and regression trees (CART) and other advanced 
statistical methods. The results are summarized in Table B below, which lists rank, R², 
adjusted R², fit standard error (S), F-statistic and the model type. Additional analysis 
and results on the number 1, number 2 and number 6 ranked models are provided in the 
Appendix in Tables 8 to 16 and Figures 2 to 12. Regression model ranked 6 has all 
predictor variables except altitude, velocity and Twall. 

Table B: Identification of Model Types for the Corridor Data Set 


Rank   R²       Adj R²   Std Error   F-statistic   Model
1      99.8%    99.7%    1.130       4140.87       Quad & Interacts (5V)
2      99.7%    99.6%    1.348       4751.89       Cubic, Interact. (3V)
3      97.6%    97.5%    3.54        895.68        Quad & Inter (3V)
4      96.77%   -        4.1533      483.15        CART
5      99.4%    99.3%    1.851       1665.11       Cubic WO Indep (7V)
6      99.4%    99.4%    1.724       3843.80       Linear (6V)
7      99.1%    99.0%    2.209       2798.68       Linear 1 (5V)
8      85.2%    84.9%    8.72        257.21        Linear (3V)


Conclusions 


A number of methods were considered in this analysis including classical multiple 
regression, polynomial regression, classification and regression trees (CART), principal 
components, correlation matrix and residual analysis. Several graphical methods were 
used in model development and in assessing model adequacy. In addition, several 
techniques were used to screen/identify underlying relationships. For example, the matrix 
plots in Tables 3 and 3a suggested the inclusion of quadratic and interaction terms in our 
models, whereas principal components and best subset selection were used to screen 
and identify the main predictor variables. All of these methods helped to guide us in the 
selection of the predictor variables in our models. Using all of the above methods, 
several promising candidate models have been developed that may be used to predict the 
response variable, heat rate, for the entry corridor data set. In the next phase of our 
research, the adequacy of our models will be validated and other advanced methods will be 
explored. 
APPENDIX 



Table 1: Corridor Data Set 
Table 2a: Summary Statistics Full Data 





Table 3: Matrix Plot for Corridor Data 
Table 5b: Principal Components Analysis 

Table 6: Best Subset Selection 


Response is Heat Rate 
Table 7a: CART MODEL 





Table 7b: CART MODEL 







Figure 1. Classification and Regression Tree for Corridor Data 





Table 8: Regression Analysis: Heat Rate (Corridor Data 3V) 





Table 9: Analysis of Variance (3 V) 
Table 10: Unusual Observations (3V) 

Figure 2. Histogram of Residuals 

Figure 3. Normal Probability Plot (3V) 

Figure 4. Residual Plot (3V) 

Table 11: Regression Analysis for Heat Rate (5V Corridor Data) 

The regression equation is 

Heat Rate, BTU/ft2/s = - 815 - 2.71 Altitude, kft 
    + 0.0521 Velocity, ft/sec + 3.59 Temp, deg R 
    - 1.45 Dyn. Pres., lb/ft2 + 0.000648 Reynolds per ft 
    + 0.00774 Alt^2 - 0.00314 Temp^2 - 0.000000 Reyn^2 
    - 0.000101 Alt*Vel - 0.000052 Vel*Temp + 0.000055 Vel*DyP 
    + 0.000283 Temp*DyP + 0.000002 DyP*Reyn 

Predictor    Coef           SE Coef       T        P
Constant     -814.9         187.1         -4.36    0.000
Altitude     -2.7076        0.4786        -5.66    0.000
Velocity     0.052053       0.005991      8.69     0.000
Temp         3.5875         0.5892        6.09     0.000
Dyn. Pres.   -1.4525        0.3884        -3.74    0.000
Reynolds     0.0006481      0.0001496     4.33     0.000
Alt^2        0.007743       0.001294      5.98     0.000
Temp^2       -0.0031392     0.0005719     -5.49    0.000
Reyn^2       -0.00000000    0.00000000    -5.65    0.000
Alt*Vel      -0.00010130    0.00001424    -7.12    0.000
Vel*Temp     -0.00005249    0.00000845    -6.21    0.000
Vel*DyP      0.00005516     0.00000750    7.36     0.000
Temp*DyP     0.0002828      0.0006029     0.47     0.640
DyP*Reyn     0.00000222     0.00000035    6.42     0.000

S = 1.130    R-Sq = 99.8%    R-Sq(adj) = 99.7% 






Table 13: Unusual Observations (5V Corridor) 









Figure 8. Normal Probability Plot (5V) 
Table 14: Regression Analysis Without Alt., Vel & Twall 

The regression equation is 

Heat Rate, BTU/ft2/s = 241 - 0.0266 Mach^3 + 1.68 Mach^2 - 33.2 Mach 
    - 0.000030 DynPress^3 + 0.0133 DynPress^2 - 2.99 Dyn. Pres., lb/ft2 
    - 0.000000 Energy^2 + 0.000117 Energy, ft3/sec 
    - 43.5 Pressure, lb/ft2 + 26874067 Density, slugs/ft3 
    - 0.0589 Temp, deg R + 0.000453 Reynolds per ft 

Predictor    Coef           SE Coef       T         P
Constant     241.19         61.32         3.93      0.000
Mach^3       -0.026629      0.004115      -6.47     0.000
Mach^2       1.6772         0.2470        6.79      0.000
Mach         -33.223        4.995         -6.65     0.000
DynPress^3   -0.00002972    0.00000566    -5.25     0.000
DynPress^2   0.013323       0.002475      5.38      0.000
Dyn. Pres.   -2.9908        0.5574        -5.37     0.000
Energy^2     -0.00000000    0.00000000    -12.94    0.000
Energy       0.00011696     0.00001050    11.06     0.000
Pressure     -43.51         23.52         -1.85     0.067
Density      26874067       16016537      1.68      0.096
Temp         -0.05891       0.08058       -0.73     0.466
Reynolds     0.0004529      0.0001233     3.67      0.000

S = 1.651    R-Sq = 99.4%    R-Sq(adj) = 99.3% 

Table 15: Analysis of Variance (7 V) 

Figure 10. Histogram of Residuals (7V) 